Synthesis of stressed speech from isolated neutral speech using HMM-based models
Abstract
In this study, a novel approach is proposed for modeling speech parameter variations between neutral and stressed conditions, and it is employed in a technique for stressed speech synthesis. The proposed method models the variations in pitch contour, voiced speech duration, and average spectral structure using Hidden Markov Models (HMMs). While HMMs have traditionally been used for recognition applications, here they are used to statistically model the characteristics needed to generate pitch contour and spectral slope patterns that modify the speaking style of isolated neutral words. An algorithm for stress perturbation is developed based on an analysis-synthesis speech model together with the HMM pitch and spectral stress characteristics. Informal listener evaluations of the stress-modified speech confirm the HMMs' ability to capture the parameter variations under stressed conditions. The proposed HMM models are both speaker- and word-independent, but unique to each speaking style. While the modeling scheme is applicable to a variety of stress and emotional speaking styles, the evaluations presented in this study focus on angry, Lombard effect, and loud spoken speech.
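As a rough illustration of the pitch part of this idea only, the sketch below samples a per-frame F0 scale factor from a small left-to-right HMM with Gaussian emissions and applies it to the voiced frames of a neutral pitch contour. It is a minimal sketch under stated assumptions, not the algorithm from the study: the three-state topology, every numeric parameter, and the names `STATES`, `sample_scale_track`, and `perturb_pitch` are hypothetical, and duration and spectral slope modification are not covered.

```python
import random

# Hypothetical 3-state left-to-right HMM for one speaking style.
# Every number is illustrative, not a parameter reported in the study.
# Each state: (self-transition probability, mean F0 scale, std of F0 scale)
STATES = [
    (0.80, 1.20, 0.03),
    (0.85, 1.35, 0.04),
    (0.90, 1.15, 0.03),
]


def sample_scale_track(n_frames, states=STATES, seed=0):
    """Sample one F0 scale factor per voiced frame from the HMM."""
    rng = random.Random(seed)
    track = []
    state = 0
    for _ in range(n_frames):
        stay, mean, std = states[state]
        track.append(rng.gauss(mean, std))
        # Left-to-right topology: stay in place or advance by one state.
        if state < len(states) - 1 and rng.random() > stay:
            state += 1
    return track


def perturb_pitch(neutral_f0, seed=0):
    """Scale the voiced frames of a neutral F0 contour (Hz); 0 marks unvoiced."""
    voiced = [i for i, f0 in enumerate(neutral_f0) if f0 > 0]
    scales = sample_scale_track(len(voiced), seed=seed)
    stressed = list(neutral_f0)
    for i, scale in zip(voiced, scales):
        stressed[i] = neutral_f0[i] * scale
    return stressed


# Toy neutral contour: 0.0 marks unvoiced frames, which are left untouched.
neutral = [0.0, 120.0, 122.0, 125.0, 0.0, 118.0, 115.0]
stressed = perturb_pitch(neutral)
```

In a fuller system the state parameters would be trained from paired neutral and stressed recordings for each speaking style, and the modified contour would be passed to an analysis-synthesis model for resynthesis.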
Similar resources
HMM-based stressed speech modeling with application to improved synthesis and recognition of isolated speech under stress
In this study, a novel approach is proposed for modeling speech parameter variations between neutral and stressed conditions and employed in a technique for stressed speech synthesis and recognition. The proposed method consists of modeling the variations in pitch contour, voiced speech duration, and average spectral structure using hidden Markov models (HMM’s). While HMM’s have traditionally b...
Speech enhancement based on hidden Markov model using sparse code shrinkage
This paper presents a new hidden Markov model-based (HMM-based) speech enhancement framework based on independent component analysis (ICA). We propose analytical procedures for training clean speech and noise models by the Baum re-estimation algorithm and present a maximum a posteriori (MAP) estimator based on a Laplace-Gaussian (for clean speech and noise respectively) combination in the HMM ...
Duration and spectral based stress token generation for HMM speech recognition under stress
In this paper, we address the problem of isolated word recognition of speech under various stressed speaking conditions. The main objective is to formulate an alternate training algorithm for hidden Markov model recognition, which better characterizes actual speech production under stressed speaking styles such as slow, loud and Lombard effect, without the need for collecting such stressed sp...
A novel training approach for improving speech recognition under adverse stressful conditions
This paper presents a new training approach for improving recognition of speech under emotional and environmental stress. The proposed approach consists of training a speech recognizer with synthetically generated speech under each stress condition using stress perturbation models previously formulated in [4, 1]. The perturbation models were previously formulated to statistically model the para...
Speech Recognition under Stress Condition
The objective of this work is to conduct a speech recognition study and evaluate its performance under stressed conditions. The study uses both isolated word recognition and keyword spotting approaches. The word models are built during training using speech collected under neutral conditions. During testing these models are tested with speech signals colle...